Posted on 06/08/2017 at 11:00 PM by The Vibe.
Throughout this week, I worked on the RDS and RData Importer and Exporter modules. Though it was quite tedious to integrate the RSRuby gem and Renviron with Travis CI, I finally got the Travis CI build to pass after 10 days and 65+ commits. Also, Game of Thrones S07E04 got leaked later this week, and it was definitely the best episode of the season so far, with an amazing 9.9/10 rating on IMDb.
The RSRuby gem makes it incredibly simple to run statements in R, with the R.eval_R() method. Thus, R commands can be called straight from Ruby: saveRDS() for the RDS Exporter, save() for the RData Exporter, readRDS() for the RDS Importer and load() for the RData Importer. Hence, I've tackled the export of Ruby Daru::DataFrame objects into R data.frame objects, and the import of R data.frame objects back into Ruby Daru::DataFrame objects, by computing an Array of Strings that is executed by the R.eval_R() method. Here are a few code snippets that explain the working of the R IO modules:
def process_statements(r_variable, dataframe)
  [
    *dataframe.map_vectors_with_index do |vector, i|
      "#{i} = c(#{vector.to_a.map { |val| convert_datatype(val) }.join(', ')})"
    end,
    "#{r_variable} = data.frame(#{dataframe.vectors.to_a.map(&:to_s).join(', ')})"
  ]
end

def rds_exporter
  @instance = RSRuby.instance
  @statements = process_statements(@r_variable, @dataframe)
  @statements << "saveRDS(#{@r_variable}, file='#{@path}')"
  @statements.each { |statement| @instance.eval_R(statement) }
end

def rdata_exporter
  @instance = RSRuby.instance
  @statements = @options.map do |r_variable, dataframe|
    process_statements(r_variable, dataframe)
  end.flatten
  @statements << "save(#{@options.keys.map(&:to_s).join(', ')}, file='#{@path}')"
  @statements.each { |statement| @instance.eval_R(statement) }
end
The working of the RData IO modules is quite similar to the code snippets given above. These modules have been approved and merged. Progress related to these IO modules can be tracked in this Pull Request.
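To make the statement-building step more concrete, here is a minimal sketch in plain Ruby that needs neither daru nor RSRuby. It assumes a Hash of column name to values in place of a Daru::DataFrame, and `convert_datatype` here is a hypothetical stand-in (the real helper is not shown in this post), as is the `build_statements` name. It only prints the R statements that would be handed to eval_R.

```ruby
# Hypothetical stand-in for convert_datatype: turns a Ruby value into an R literal.
# The real helper in daru-io may handle more types than this sketch does.
def convert_datatype(value)
  case value
  when nil     then 'NA'
  when true    then 'TRUE'
  when false   then 'FALSE'
  when Numeric then value.to_s
  else
    # Quote everything else as an R string, escaping backslashes and single quotes.
    "'#{value.to_s.gsub(/[\\']/) { |ch| "\\#{ch}" }}'"
  end
end

# Builds one `name = c(...)` assignment per column, followed by a
# data.frame(...) call that bundles those R vectors under r_variable.
# `columns` is a plain Hash here, assumed in place of a Daru::DataFrame.
def build_statements(r_variable, columns)
  assignments = columns.map do |name, values|
    "#{name} = c(#{values.map { |v| convert_datatype(v) }.join(', ')})"
  end
  assignments << "#{r_variable} = data.frame(#{columns.keys.join(', ')})"
end

columns = { 'a' => [1, 2, 3], 'b' => ['x', 'y', nil] }
statements = build_statements('df', columns)
statements << "saveRDS(df, file='sample.rds')"

puts statements
# a = c(1, 2, 3)
# b = c('x', 'y', NA)
# df = data.frame(a, b)
# saveRDS(df, file='sample.rds')
```

Keeping the statements as an Array of Strings means they can be inspected or tested without R installed, and only the final `each` over eval_R touches the R runtime.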
Until the previous week, I was quite doubtful about whether Avro files contain both schema and data, or just the schema. The good news is that things have fallen into place after a discussion with my mentor Victor Shepelev (zverok) in this issue tracker. Support for the Avro IO module is planned to be provided with the avro gem. Here is a small code snippet that was used in the Avro Importer:
buffer = StringIO.new(File.read('path/to/avro/file'))
data = Avro::DataFile::Reader.new(buffer, Avro::IO::DatumReader.new).to_a
Daru::DataFrame.new(data)
Progress related to this module can be tracked from this tree.
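As a rough illustration of the data shape involved: for a record schema, the avro gem's reader is expected to yield one Hash per record, keyed by field name, so `data` above is an Array of Hashes. The sketch below uses hand-written records in place of a real Avro file (an assumption for the sake of a runnable example) and pivots them into columns in plain Ruby. It is not how daru builds the DataFrame internally, only a picture of the rows-to-columns mapping.

```ruby
# Hand-written records standing in for the output of
# Avro::DataFile::Reader#to_a (assumed shape: one Hash per record).
records = [
  { 'name' => 'Jon',  'house' => 'Stark',     'age' => 23 },
  { 'name' => 'Dany', 'house' => 'Targaryen', 'age' => 22 },
  { 'name' => 'Sam',  'house' => 'Tarly' } # 'age' missing on purpose
]

# Collect every field name seen, in first-seen order.
fields = records.flat_map(&:keys).uniq

# Pivot rows into columns; a missing field becomes nil in that row.
columns = fields.map { |field| [field, records.map { |r| r[field] }] }.to_h

columns.each { |name, values| puts "#{name}: #{values.inspect}" }
# name: ["Jon", "Dany", "Sam"]
# house: ["Stark", "Targaryen", "Tarly"]
# age: [23, 22, nil]
```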
A sample Rails app is under development in this repository, to showcase the simplifications that are introduced by both daru-io and daru-view. Currently, an example has been added to show the working of the JSON Importer: it obtains a Daru::DataFrame from the GitHub API and uses it to create a DataTable, a Plot and even 'Export to {format}' buttons that use the appropriate daru-io Exporters. Here are a few screenshots that depict this.